In this lesson, you will use R to take a closer look at the data from the LANCE MODIS Near Real Time (NRT) Global Flood Product, including learning about what are LANCE and MODIS, and the NRT Flood products available. You will then learn to select, download, and visualize one of the NRT Flood layers…
Coding Review
This lesson uses the R language and environment. R is a popular language used for statistical computing and graphics.
Learning Objectives
After completing this lesson, you should be able to:
Determine what NRT raster data is available by navigating the LANCE website.
Read a tile map and select a raster tile to download based on a point of interest.
Download near-real-time raster data using the application programming interfaces (APIs) with the GET HTTP request method (Wilson 2024).
View the downloaded raster data to quickly preview.
Classify and place on a map the NRT flood spatial data to determine areas with unusual flooding.
Subset data, perfom zonal statistics using spatial data, and graph analysis results.
Introduction
Atmospheric circulation, water evaporation, and their interactions with land surfaces can impact a region’s rainfall variability. For example, California’s winter is correlated to ocean evaporation near the west coast and eastern North Pacific, and ocean evaporation is a strong factor in increased flooding in the region (Wei et al. 2016). Additionally, drought in the region is associated to high pressure systems off the U.S. west coast, with studies finding that the high pressure system is linked to the Pacific sea surface temperature anomalies, and exacerbated by high evaporation over land due to high temperatures.
It is critical to understand the water cycle and how flooding events develop, particularly as climate change intensifies extreme weather events, because the impacts of flooding can be a risk to human life, and can disrupt infrastructure, agriculture, and natural habitats.
The MODIS/Aqua+Terra Global Flood Product L3 Near Real Time (NRT) 250m Global Flood Product (MCDWD_L3_NRT) (beta) provides daily maps of flooding globally. The product is provided over 3 compositing periods (1-day, 2-day, and 3-day) to minimize the impact of clouds and more rigorously identify flood water (the best composite will depend on the cloudiness for a particular event) (Lin et al. 2019)
What are MODIS and LANCE?
MODIS
The Moderate Resolution Imaging Spectroradiometer (MODIS) is a NASA Earth Observing System (EOS) satellite-based sensor system that creates data products including land surface temperatures, land surface reflectance, radiances, clouds, aerosols, water vapor, active fire, snow cover, sea ice measurements, and other factor information. The MODIS Near Real-Time (NRT) data includes the Flood product which is a daily ~250 m resolution product showing flood and surface water detected from the twice-daily overpass of the MODIS optical sensors.
The satellite data is readily available shortly after it is acquired by the MODIS instrument on board the Terra and Aqua satellites. This space-based instrument distinguishes 36 spectral bands and groups of wavelengths which helps map the extent of snow and ice caused by winter storms and frigid temperatures. Initially, the water-detecting algorithm is applied to both MODIS observations (Terra and Aqua). Due to cloud and terrain shadows create false positives.
To minimize errors, the product is generated with three different compositing periods (1-day, 2-day, and 3-day) to compare results and decide which product has better coverage for the event. Further, they have to differentiate floods from expected surface water through the use of MODIS Land Water Mask (MOD44W), which uses a decision tree classifier trained with MODIS data to produce a global water mask (Carroll et al. 2016).
MODIS adoption aimed to surpass barriers related to satellite data, including cost, delivery timelines, limited formats, and the need for technical expertise. The transition to GFIMS establishes an operational system at FAO, ensuring continuity in meeting NASA data-user needs (Lin et al. 2019).
LANCE
The Land, Atmosphere Near Real-time Capability for EOS (LANCE) is a NASA initiative that provides near real-time access to satellite data, including MODIS. It allows users to access the latest data within a few hours of satellite overpass, enabling rapid responses to environmental events such as floods. LANCE is particularly valuable for emergency response teams and researchers who require up-to-date information for monitoring and assessing natural disasters (Land 2024).
LANCE reduces processing time, allowing for timely computation. Users access the data through platforms like Web Map Service (WMS) and Web Coverage Service (WCS), enabling visualization and analysis for informed decision-making. This NRT approach enhances the speed and accessibility of critical information on vegetation conditions (Zhang et al. 2022).
MODIS NRT Flood MCDWD Data Products
Data Information
The main landing pages for the MODIS NRT Global Flood Product:
The MODIS/Terra+Aqua Combined MODIS Water Detection (MCDWD) algorithm is tailor-made for detecting water bodies using MODIS data obtained from both the Terra and Aqua satellites. This algorithm employs various bands and spectral information to effectively identify and categorize water bodies. This enhances the accuracy and reliability of the flood product generated (Slayback 2023).
The MODIS Near Real-Time (NRT) Flood dataset offers multiple products, each accompanied by corresponding layers. The specific layers depend on the temporal aggregation:
MCDWD_F1_L3_NRT (1-Day product) This product type is the most basic level and provides binary information about water occurrence. Pixels are classified as either containing water or not, offering a simple way to identify flooded areas.
MCDWD_F1CS_L3_NRT (1-Day CS): F1CS has a cloud shadow mask applied on the version of the MCDWD_F1_L3_NRT product
MCDWD_F2_L3_NRT (2-Day): F2 provides additional information by categorizing water occurrences into three categories: no water, low-confidence water, and high-confidence water. This classification allows for a more nuanced understanding of the extent of the flood and its associated confidence levels.
MCDWD_F3_L3_NRT (3-Day): The F3 product, based on the MCDWD (MODIS/Terra+Aqua Combined MODIS Water Detection) is an algorithm that further refines flood mapping by adding additional spectral information. These results create a more accurate representation of water bodies and flooded areas (Slayback 2023).
Exploring the Data
For this exercise, we will be using the MCDWD L3 F3 product: LANCE NRT Flood
First, install and load the R packages required for this exercise:
Package stars is already installed.
Package httr is already installed.
Package jsonlite is already installed.
Package tmap is already installed.
Package basemaps is already installed.
Package sp is already installed.
Package sf is already installed.
Package ggplot2 is already installed.
#in case tmap does not install#remotes::install_github('r-tmap/tmap')
Coding Review
This lesson uses the stars, httr, jsonlite, tmap, basemaps, and ggplot2 packages. If you’d like to learn more about the functions used in this lesson you can use the help guides on their package websites.
Based on availability, edit the tile_code variable:
#add tile code from the map above (written as h00v00)tile_code <-'h05v05'
This is the NRT Flood F3 (MCDWD_L3_F3) API URL:
# Primary ServerAPI_URL <-paste0('https://nrt3.modaps.eosdis.nasa.gov/api/v2/content/details?products=MCDWD_L3_F3_NRT&archiveSets=61&temporalRanges=')# if the primary server is down, the secondary server may be available:#API_URL <- paste0('https://nrt4.modaps.eosdis.nasa.gov/api/v2/content/details?products=MCDWD_L3_F3_NRT&archiveSets=61&temporalRanges=')
We can combine the API URL above with the year_day provided and print the available datasets:
#pasting together URL and year_dayurl <-paste0(API_URL, year_day)print(url)
Out of the response from the server, we’ll check if the response was a success with if (http_status(response)$category == "Success"). If this statement is true, then the content will be assigned to the variable data in JSON format, which is then parsed to a data frame using data_parsed <- jsonlite::fromJSON(data). The data frame contains data_parsed$content, a column with content. We filter the content by tile code using the command content_items <- data_parsed$content[grepl(tile_code, data_parsed$content$name, ignore.case = TRUE), ] and add the results to a data frame.
if (http_status(response)$category =="Success") {# Read the JSON content into a data frame df <-content(response, as ="parsed", simplifyVector =TRUE) df <- df$content} else {print("Request failed with status code", http_status(response)$status_code)}names(df)
Use the read_stars() function from the stars R Library to read the geoTiff raster. The raster is assigned to the raster_df variable. This is a web-derived layer that is in the Geographic Coordinate System (GCS) WGS84 - World Geodetic System 1984 EPSG:4326 (EPSG number 4326). We can assign the Coordinate Reference System (CRS) to the stars object:
To view the data in this classification, we’ll need to create a classified legend; however, the NRT Flood data is stored in decimal numbers (aka floating-point). Create class breaks dividing the data by the following breaks, and corresponding colors and labels. Finally, add a title for the plot that includes the year and day, year_day, and the tile_code:
#class_breaks for the table plot categorization class_breaks <-c( 0, 1, 2, 3, Inf)#colors for the plot breaks colors <-c( "gray", "blue", "yellow", "red")#labels for the plot breakslabels =c("0: No Water", "1: Surface Water", "2: Recurring flood", "3: Flood (unusual)")title =paste("Near Real-Time Flood F3", year_day, tile_code)
Generate a basemap from Esri Imagery
The basemap_stars() function from the stars package allows us to access Esri imagery layers. To generate a basemap that shows the location of our raster, we must first can deifne the bounding box with the raster_df. Choose “world_imagery” as our background and assign it to the object bm_m. the st_rgb function combines the basemap RGB image into a single file. Finally, st_transform transform the CRS into WGS84.
#define basemap using a the raster as a bounding boxbm_m <- basemaps::basemap_stars(raster_df, map_service ="esri", map_type ="world_imagery")
Loading basemap 'world_imagery' from map service 'esri'...
#The `st_rgb` function lets us turn the RGB stars item into a single imagebm_m <-st_rgb(bm_m)#bm_m <-st_transform(bm_m, 4326)
Plot basemap and NRT Flood data
Generate a plot from the tmap library using the tm_shape() function. We will plot the basemap and the raster_df items in one plot.
## tmap mode set to "plot"tmap_mode("plot")## tmap mode can also be set to "view"#tmap_mode("view")#create an object the plots the basemap and the NRT flood raster#with the tmap library, call the tm_shape() function for the basemaptm_plot <-tmap_options(max.raster =c(plot =50000, view =50000))+tm_shape(bm_m)+#the basemap bm_m is treated as a rastertm_raster()+#create a new tmap shape for the NRT flood raster with style as "cat," meaning categorical.tm_shape(raster_df, style="cat")+#add the classification styling to the raster including colors, title, class breaks and labelstm_raster( palette =c(colors),title = title, breaks = class_breaks,labels = labels )+#style the plottm_layout(legend.outside =TRUE) +tm_graticules(lines=FALSE)#view Plot tm_plot
To observe a location closer, we can create a new bounding box using latitude and longitude values, clip the data and replot on a map. Select from the map four corners in degrees including North/South, East/West information. We use these coordinates to create a matrix of points that represent four corners. We can create a bounding box from these corners. The corners are first placed in the GCS WGS84 to match out raster.
# Define the NWES coordinatesnorth <-40#negative values are used for Southsouth <-36#negative values are used for southwest <--124#negative numbers are used for Westeast <--120#negative numbers are used for West# Create a bounding boxbbox_subset <-st_bbox(c(xmin = west, ymin = south, xmax = east, ymax = north), crs =4326)
We redo the same process as above to plot the data with a basemap but with a different bounding box:
#generate a new basemapbm_m <- basemaps::basemap_stars(bbox_subset, map_service ="esri", map_type ="world_imagery")
Loading basemap 'world_imagery' from map service 'esri'...
#combine RGB bands of the bsemapbm_m <- stars::st_rgb(bm_m)#transform to CRS 4326bm_m <- sf::st_transform(bm_m, 4326)
## tmap mode set to "plot"tmap_mode("plot")## tmap mode can also be set to "view"#tmap_mode("view")#create an object the plots the basemap and the NRT flood raster#with the tmap library, call the tm_shape() function for the basemaptm_plot <-tmap_options(max.raster =c(plot =50000, view =50000))+tm_shape(bm_m, raster.downsample=TRUE)+tm_raster()+#create a new tmap shape for the NRT flood raster with style as "cat," meaning categorical.tm_shape(raster_df, style="cat", raster.downsample=TRUE)+#add the classification styling to the rastertm_raster( palette =c(colors),title = title, breaks = class_breaks,labels = labels )+#style the plottm_layout(legend.outside =TRUE) +tm_graticules(lines=FALSE)#View the plot:tm_plot
Zonal Statistics of Flood Zones
The NRT flood data can be compared to the Cropland Data Layer (CDL), which provides agricultural categories based on the Farm Service Agency (FSA) Common Land Unit (CLU) Program and is produced by the U.S. Department of Agriculture National Agricultural Statistics Service (NASS)(USDA-NASS 2023).
Download the latest CDL year of data available from the link above.
For this part of the exercise, we will use the package stars, discussed in the previous lesson, and the function read_stars() to read the CDL raster as a stars object.
#Read CDL GeoTIFFcdl<-read_stars("2022_30m_cdls.tif")#plot the raster and downsample by 100 as the resolution of the raster is high plot(cdl, downsample=100)
This raster is provided with a 30-meter resolution for the entire contiguous U.S.! Additionally, the CRS is the North American Datum of 1983 (NAD83) / Albers Conical Equal Area (AEA) EPSG:5070 projection. We can subset stars objects with other stars objects if they are the same CRS. We will first subset the NRT Flood raster and project the data into the same CRS sd the CDL raster. We can subset this raster with the NRT flood raster, but we must first crop it with the bounding box bbox_subset we created previously, and use the warp() function to change the raster from the CRS WGS84 EPSG:4326 to NAD83 EPSG:5070:
#subset the NRT Floor raster with the bounding boxraster_sub <- raster_df[bbox_subset]#warp the raster NRT flood raster to match the CDL raster CRS.raster_sub <-st_warp(raster_sub, crs=5070, use_gdal=FALSE,method ="near")#subset CDL raster with the subset NRT flood rastercdl_sub <- cdl[raster_sub]#because of the high resolution, a stars proxy object was made#make a stars object from a stars proxy object to directly read the datacdl_sub <-st_as_stars(cdl_sub)#plot the resulting CDL subsetplot(cdl_sub, downsample=100)
We are only interested in the areas that are showing as category 3, Flooded (unusual); therefore, we want to exclude other variables. We make a copy of the raster_sub variable and set variables that are not 3 as NA.
#make copy of variableraster3 <- raster_sub#set variables as NAraster3[raster3 !=3] <-NA#single break for plottingbreaks3 <-c(0,3)#plot the subset areas plot(raster3, breaks=breaks3, col="red",downsample=20)
Now that we have only the flooded areas from the NRT Flood data, we can subset the CDL data.
#subset the CDL raster where unusual flooding occurredcdl_sub3 <- cdl_sub[raster3==3]
The CDL FAQ Webpage lists the categorization, including non-agricultural categories which we can remove from the analysis:
Next, we can create a table out the remaining CDL raster data, but we first have to extract it from the stars object and convert it to a matrix. The matrix can then converted to a table and sorted.
# Convert the stars object to a matrixcdl_matrix <-as.matrix(cdl_sub3)# Count the number of pixels for each categorycdl_counts <-table(cdl_matrix)#Sort the `cdl_counts` table by descending order.cdl_counts<-sort(cdl_counts, decreasing =TRUE)
Select the top 10 CDL categories counted.
top_10 <- cdl_counts[1:10]#print top 10print(top_10)
Turn the top 10 table into a dataframe. This will allow us to more easily handle the dataframe table and data.
dataplot <-as.data.frame(top_10)
According to the NASS CDL data documentation, each of the CDL’s pixels represents 30 square meters. Additionally, a hectare is 10,000 meters. We can calculate the hectares of each CDL category by first multiplying the count of each category by 30 and dividing by 10,000.
# Calculate the area for each rowdataplot$area <- dataplot$Freq *30/10000
Once the are is calculated, we can plot a bar graph to compare the top 10 CDL Categories found in the NRT Flood zone.
#create new title for plottitle =paste("Hectares of Cropland Category under flood, day", year_day, "in tile", tile_code)# Find the max value in the table for a dynamic axis plot max_value <-max(dataplot$area) +3# use ggplot to plot the dataplot table with area # in x axis and category name in the y axis plt <-ggplot(dataplot, aes(x = area, y = cdl_matrix)) +geom_bar(stat ="identity", fill ="skyblue") +# adjust the height and colorgeom_vline(xintercept =seq(0, max_value, by =1), color ="gray", linetype ="dashed") +#create guide lineslabs(title = title, x ="Hectares (ha)", y ="Categories") +#add labels to plottheme_minimal() +#set themetheme(axis.line =element_line(size =1), # Adjust size of axis linesaxis.ticks =element_line(size =1),# Adjust size of axis tickspanel.grid.major =element_blank(), # Remove major gridlinespanel.grid.minor =element_blank(),# Remove minor gridlinesplot.background =element_rect(fill ="white"),# Set plot background colorpanel.background =element_rect(fill ="white"),# Set panel background colorpanel.border =element_rect(color ="black", fill =NA)) +# Set panel bordergeom_text(aes(label =round(area,1)), color ="black") +xlim(0, max_value) # Adjust x-axis limits # Adjust size of text labelsplt
Congratulations! Now you should be able to:
Navigate the LANCE data website and determine what data is available.
Select a tile and date to download NRT data.
Create a GET HTTP request to download near-real-time data.
Plot on a map and classify raster data to determine areas with unusual flooding.
Use NRT Flood data to subset cropland classification data, perform zonal statistics, and graph results.
Lesson 4
In this lesson we explored the LANCE MODIS Near-Rear-Time (NRT) Flood dataset. In our next lesson we will think of water at the local level and focue on New Nork State School Drinking Water datasets.
Carroll, M. L., C. M. DiMiceli, J. R. G. Townshend, R. A. Sohlberg, A. I. Elders, S. Devadiga, A. M. Sayer, and R. C. Levy. 2016. “Development of an Operational Land Water Mask for MODIS Collection 6, and Influence on Downstream Data Products.”International Journal of Digital Earth 10 (2): 207–18. https://doi.org/10.1080/17538947.2016.1232756.
Wei, Jiangfeng, Qinjian Jin, Zong-Liang Yang, and Paul A. Dirmeyer. 2016. “Role of Ocean Evaporation in California Droughts and Floods.”Geophysical Research Letters 43 (12): 6554–62. https://doi.org/10.1002/2016gl069386.
Zhang, Chen, Zhengwei Yang, Liping Di, Eugene G. Yu, Bei Zhang, Weiguo Han, Li Lin, and Liying Guo. 2022. “Near-Real-Time MODIS-Derived Vegetation Index Data Products and Online Services for CONUS Based on NASA LANCE.”Scientific Data 9 (1). https://doi.org/10.1038/s41597-022-01565-2.